Abstract

A method for increasing the translation efficiency of a mRNA sequence
is provided. In certain embodiments, the method comprises the steps
of: (a) constructing a predicted secondary structure for the mRNA;
(b) analyzing the predicted secondary structure to determine if either
or both of the AUG initiation codon and the Shine-Dalgarno sequence is
contained in a double stranded portion of a stem-loop region of the
predicted secondary structure; (c) calculating a free energy value for
the stem-loop region; and (d) if the calculated free energy value is in
the range of from zero to about -10.0 kcal/mole, modifying the DNA
sequence for the mRNA so that when the modified sequence is transcribed
it produces a modified mRNA sequence which has a predicted secondary
structure wherein the AUG initiation codon and the Shine-Dalgarno
sequence are not included in a double stranded portion of a stem-loop
region of the predicted secondary structure. By means of this method,
ten-fold increases in protein production have been achieved.
METHOD FOR IMPROVING TRANSLATION EFFICIENCY



BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to a method for increasing
the production of proteins by biological cells and, in
particular, to a method for improving the efficiency
with which messenger RNA (mRNA) is translated.

2. Description of the Prior Art

Extensive research has been performed to attempt
to understand why some messenger RNAs are translated
more efficiently than other messenger RNAs. For
example, studies using E. coli have been performed in
which nucleotide sequences of varying lengths have been
placed between the AUG initiation codon and the
Shine-Dalgarno (SD) sequence. See Shepard et al.,
"Increased Synthesis in E. coli of Fibroblast and
Leukocyte Interferons Through Alterations in Ribosome
Binding Sites," DNA, 1:125-131 (1982). These studies
identified an optimal spacing of 9 nucleotides. (As
known in the art, the Shine-Dalgarno sequence comprises
3-12 nucleotides which are found in prokaryotic mRNAs
and which are complementary to the 3' end of 16S rRNA.)
Similarly, the effects of base composition in the
region between the SD sequence and the initiation codon
have been studied. See De Boer et al., "Portable
Shine-Dalgarno Regions: A System for a Systematic
Study of Defined Alterations of Nucleotide Sequences
within E. coli Ribosome Binding Sites," DNA, 2:231-235
(1983). In this case, it was found that the presence
of A or T residues resulted in higher levels of mRNA
translation, while C or G residues led to lower levels
of translation.

In addition to the foregoing, various studies have
been performed to determine the effects of mRNA
secondary structure on translation efficiency. See,
for example, Boyen et al., "Enhancement of Translation
Efficiency in Escherichia coli by Mutations in a
Proximal Domain of Messenger RNA," J. Mol. Biol.,
162:715-720 (1982); Coleman et al., "Mutations Upstream
of the Ribosome-binding Site Affect Translational
Efficiency," J. Mol. Biol., 181:139-143 (1985); Hall et
al., "A Role for mRNA Secondary Structure in the
Control of Translation Initiation," Nature, 295:616-618
(1982); Iserentant et al., "Secondary Structure of mRNA
and Efficiency of Translation Initiation," Gene, 9:1-12
(1980); Kastelein et al., "Effect of the sequences
upstream from the ribosome-binding site on the yield of
protein from the cloned gene for phage MS2 coat
protein," Gene, 23:245-254 (1983); Queen et al.,
"Differential Translation Efficiency Explains
Discoordinate Expression of the Galactose Operon,"
Cell, 25:241-249 (1981); Schoner et al., "Role of mRNA
translational efficiency in bovine growth hormone
expression in E. coli," Proc. Natl. Acad. Sci. USA,
81:5403-5407 (1984); Shepard et al., supra; and Steege,
D., "5'-Terminal nucleotide sequence of Escherichia
coli lactose repressor mRNA: Features of translational
initiation and reinitiation sites," Proc. Natl. Acad.
Sci. USA, 74:4163-4167 (1977).
In these studies, A-U and G-C base pairing has
been used to construct mRNA secondary structures from
mRNA primary structures. The secondary structures have
then been examined to determine if either or both of
the translation initiation determinants, e.g., the SD
sequence and the initiation codon, appear in a double
stranded portion of the folded mRNA molecule, as
opposed to a single stranded portion. In addition, the
thermodynamic stabilities of the secondary structures
have been calculated using the methods of Tinoco et al.
See Tinoco et al., "Improved Estimation of Secondary
Structure in Ribonucleic Acids," Nature New Biology,
246:40-41 (1973); Tinoco et al., "Estimation of
Secondary Structure in Ribonucleic Acids," Nature,
230:362-367 (1971); and Borer et al., "Stability of
Ribonucleic Acid Double-Stranded Helices," J. Mol.
Biol., 86:843-853 (1974). See also Salser, W., Cold
Spring Harbor Symp. Quant. Biol., 42:985-1002 (1977).
Lower levels of protein production have generally been
found to correlate with the sequestering of translation
initiation determinants in double stranded regions of
thermodynamically stable secondary structures, i.e.,
secondary structures having a calculated free energy
(ΔG) more negative than about -10.0 kcal/mole, i.e.,
free energies more negative than the free energy of
hydrolysis of ATP (ΔG = -7.3 kcal/mole).

Significantly, with regard to the present
invention, the sequestering of translation initiation
determinants in double stranded regions of secondary
structures having calculated free energies more
positive than -10.0 kcal/mole has not been considered
important with regard to changes in translation
efficiency. For example, in Hall et al., supra, a SD
sequence sequestered in a secondary structure having a
free energy of -2.0 kcal was considered accessible for
ribosome binding due to the instability of the
secondary structure. Similarly, in Iserentant et al.,
supra, an SD sequence was considered largely single
stranded where release of the sequence from a double
strand configuration only required 3.2 kcal. Along
these same lines, in De Boer et al., supra, it was
stated that the observed differences in translation
efficiency which these workers had found could not be
explained on the basis of secondary structure because
significantly stable stem-loops having ΔG values less
than -10 kcal could not be found. Again, in Coleman et
al., supra, a secondary structure having a free energy
of -3.05 kcal was considered unlikely to significantly
affect ribosome binding to mRNA.

In contrast to these views of prior workers, as
discussed and demonstrated in detail below, in
accordance with the present invention it has been
surprisingly found that translation efficiency can be
significantly affected, e.g., by a factor of 10, by the
sequestering of translation initiation determinants in
secondary structures having calculated free energies
more positive than -10.0 kcal/mole, e.g., on the order
of -3.9 kcal/mole. As shown below, based on this
discovery, even greater increases in protein production
can be achieved than would have been achieved based on
the prior understanding of the effects of mRNA
secondary structure on translation efficiency.

SUMMARY OF THE INVENTION

In view of the foregoing state of the art, it is
an object of this invention to increase the production
of protein by biological cells. More particularly, it
is an object of the invention to increase the
efficiency with which mRNA is translated to produce
protein.

To achieve the foregoing and other objects, the
invention provides a method for increasing the
translation efficiency of a mRNA sequence which is
produced from a DNA or RNA sequence comprising the
steps of:

(a) determining and analyzing a predicted
secondary structure for the mRNA to determine if either
or both of the AUG initiation codon and the
Shine-Dalgarno sequence is contained in a double
stranded portion of a stem-loop region of the predicted
secondary structure;

(b) calculating a free energy value for the
stem-loop region; and

(c) modifying the DNA or RNA sequence for the
mRNA, if the calculated free energy value is in the
range of from zero to about -10.0 kcal/mole, so that
the modified DNA or RNA sequence produces a modified
mRNA sequence which has a predicted secondary structure
wherein either:

    (i) the AUG initiation codon and the
    Shine-Dalgarno sequence are not included in a
    double stranded portion of a stem-loop region
    of the predicted secondary structure; or

    (ii) if either or both of the AUG initiation codon
    and the Shine-Dalgarno sequence are included
    in a double stranded portion of a stem-loop
    region of the predicted secondary structure,
    the calculated free energy value for such
    stem-loop region is more positive than the
    free energy value calculated in step (b).

As described in detail below, by means of the
foregoing procedure, ten-fold increases in protein
production have been achieved.

In certain preferred embodiments, the free energy
value calculated in step (b) is in the range of from
zero to about -7.0 kcal/mole, i.e., the calculated free
energy is more positive than the free energy of
hydrolysis of ATP. This range is particularly likely
to have been ignored by prior art workers since it
represents structures whose energy is less than the
energy of hydrolysis of only one ATP.

The accompanying drawings, which are incorporated
in and constitute part of the specification, illustrate
the application of the invention to the production of
human α1 interferon, and together with the description,
serve to explain the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the construction of hybrid plasmid
pNL014. Abbreviations used: lpp^P = E. coli
lipoprotein promoter, lpp^T = E. coli lipoprotein
transcription terminator, A^r = Ampicillin resistance
gene, Gal^P = Yeast galactose promoter, α1 = mature
interferon α1 gene, Met-α1 = methionyl-α1, α1^T =
Interferon α1 transcription terminator, SUC^T = Yeast
invertase transcription terminator, Xb = XbaI, E =
EcoRI, H = HindIII, S = SalI, C = ClaI, 2μ = Yeast 2μ
replication origin, URA3 = Yeast URA3 gene, and a.a. =
amino acid.

Figure 2 shows the construction of hybrid plasmid
pNL015. Abbreviations: as in Figure 1. pIN-I =
pIN-I-A3 vector. Marked bases are those which do not
appear in pNL008.

Figure 3 shows the construction of hybrid plasmid
pNL008. Abbreviations: as in Figure 1. S1 = S1
nuclease. Marked bases are those which do not appear
in pNL015. EcoRI linker (GAATTC) was obtained from New
England Biolabs.

Figure 4 shows the predicted secondary structures
for mRNA produced from plasmids pNL015 and pNL008. The
sequences start with the first base (1) of the
transcripts. Sequences under the broken lines are the
Shine-Dalgarno region. The initiation codon is
indicated by a heavy bar and the deleted or inserted
sequences are marked by a light bar. The AUG of the
methionyl interferon is marked by dots. Calculated
free energies of the secondary structures are -3.9
kcal/mole and -3.2 kcal/mole for pNL015 and pNL008
transcripts, respectively. Arrows indicate base
substitutions as occurred in pNL016 and pNL017.

Figure 5 shows the results of a pulse-chase
analysis of the in vivo stability of interferon fusion
protein produced by JA221/pNL008 (lanes 1-4) and
JA221/pNL015 (lanes 5-8). After E. coli cells were
incubated with [35S]-methionine for 40 minutes, a
40,000-fold excess of unlabeled methionine was added.
At various chase times, samples were withdrawn and cell
lysates were prepared for PAGE and immunostaining.
Lane 0 contains molecular weight standards. Chase
times were lanes 1 and 5: 0 min.; lanes 2 and 6: 15
min.; lanes 3 and 7: 30 min.; lanes 4 and 8: 60 min.
Alpha 1 indicates the position of IFN α1 fusion
protein. The M.W. of IFN fusion protein from pNL015
was slightly higher than that from pNL008 due to 3
extra amino acid residues.

Figure 6 shows the results of RNA dot blot
hybridization experiments for JA221/pNL015 and
JA221/pNL008. Columns 1 and 2 represent serial
dilution (1:5) of total cellular RNA (top row 40 μg) of
JA221/pNL015 and JA221/pNL008, respectively. The total
RNA was isolated following the procedure of Young and
Furano and applied to nitrocellulose paper. See Young,
F. S. and Furano, A. V., "Regulation of the synthesis
of E. coli elongation factor Tu," Cell, 24 (1981)
695-706. Nick-translated EcoRI fragment of IFN α1 gene
from pNL015 was used for hybridization according to
Maniatis et al. See Maniatis, T., Fritsch, E. F., and
Sambrook, J., Molecular Cloning, Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York (1982).

Figure 7 shows the results of stability analyses
of IFN α1 mRNA transcribed from pNL015 (solid
triangles) and pNL008 (open triangles). Rifampicin
(Calbiochem) was added (200 μg/ml) to five 15 ml cell
cultures of JA221/pNL008 and JA221/pNL015,
respectively. The cultures were incubated at 37°C for
5, 10, 15, and 20 min. After each incubation, 50 μCi
of [35S]-methionine was added to each culture, and the
mixtures were incubated for another 1.5 min. The
control cultures (zero time) were labeled with [35S]-
methionine in the absence of rifampicin. Cell extracts
were prepared and analysed by SDS-PAGE. The IFN α1
peak positions were identified by immunostaining. The
peaks were cut out from the gel blot and the
radioactivities of these fractions were measured in ACS
scintillation fluid after being solubilized in 6.5 ml
of a NCS solubilizer (Amersham Corp.) - water mixture
(9:1) at 50°C for 2 hrs. The incorporation rates were
calculated as percentage of incorporation at zero time.

Figure 8 shows the construction of pNL016 and
pNL017. Abbreviations as in Figure 1. Oligonucleotides
were synthesized by using an Applied Biosystems DNA
Synthesizer and purified by gel elution. Following
phosphorylation by polynucleotide kinase, appropriate
pairs of oligonucleotides were mixed and annealed at
70°C in the presence of 66 mM Tris-HCl, pH 7.5 / 6 mM
MgCl2 / 500 μM ATP in a volume of 270 μl. The mixtures
were cooled to 15°C during one hour to form DNA
fragment A or B. Arrows indicate positions of base
substitution as compared to pNL015.
DESCRIPTION OF THE PREFERRED EMBODIMENTS

As described above, the present invention relates
to the discovery that the sequestering of one or both
of the AUG initiation codon and the Shine-Dalgarno
sequence in a double stranded portion of even a weakly
bound (relatively unstable) mRNA secondary structure,
i.e., a secondary structure having a calculated free
energy more positive than -10.0 kcal/mole, can have
significant effects on protein production. In
particular, it has been found that the elimination of
such secondary structures can result in over a
five-fold increase, e.g., a ten-fold increase, in
protein production.

The first step of the process of the invention
involves determining if all or part of either or both
of the AUG initiation codon and the Shine-Dalgarno
sequence is contained in a double stranded portion of a
stem-loop region of the predicted secondary structure
for the mRNA (i.e., the secondary structure predicted
from pairing of the bases of the mRNA's primary
structure as opposed to an experimentally observed
secondary structure).
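
As a minimal illustrative sketch of this first step (not part of the
procedure described in this specification), the check can be written in
Python, assuming the predicted secondary structure is available in
dot-bracket notation, where "(" and ")" mark paired bases and "." marks
unpaired bases. The notation, the function names, and the toy sequence
and structure below are all assumptions introduced only for
illustration.

```python
def is_sequestered(structure: str, start: int, end: int) -> bool:
    """Return True if all or part of the element occupying the 0-based,
    half-open interval [start, end) lies in a double stranded region."""
    return any(ch in "()" for ch in structure[start:end])


def locate(sequence: str, motif: str) -> tuple[int, int]:
    """Return the [start, end) interval of the first occurrence of motif."""
    start = sequence.find(motif)
    if start < 0:
        raise ValueError(f"{motif!r} not found in sequence")
    return start, start + len(motif)


# Hypothetical toy mRNA and structure; not taken from any plasmid
# described in this specification.
sequence = "AAGGAGGAAAACCUCCAUGAAA"
structure = "..(((((....)))))......"

sd_paired = is_sequestered(structure, *locate(sequence, "GGAGG"))
aug_paired = is_sequestered(structure, *locate(sequence, "AUG"))
print(sd_paired, aug_paired)
```

In this toy case the Shine-Dalgarno-like motif falls inside the stem
while the AUG lies in a single stranded stretch, so only the first
check reports sequestering.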

This determination is most conveniently performed
by conducting computer analyses on the base sequence of
the mRNA to identify one or, in some cases, a group of
possible secondary structures for the mRNA. When a
group of structures is identified, a most probable
secondary structure can normally be selected following
techniques known in the art. See Akiyoshi Wada and
Akira Suyama, "Local Stability of DNA and RNA Secondary
Structure and its Relation to Biological Functions,"
Prog. Biophys. Molec. Biol., Vol. 47, pp 113-157
(1986); Nicholls, N., The Regulation of Bacillus
Licheniformis Penicillinase: Locating and Sequencing
the Repressor Gene, Ph.D. Thesis, Rutgers University,
Dissertation Abstracts International, Molecular
Biology, Vol. 47/06-B, 1986. In some cases, however,
it may be advantageous to further analyze a group of
possible secondary structures, as opposed to a single
most probable structure. For example, depending on the
details of the sequences involved, it may be possible
to produce a modified mRNA which eliminates binding of
the AUG initiation codon and/or the Shine-Dalgarno
sequence for a whole family of secondary structures.
As an alternate to using a computer program, secondary
structures can be determined visually, i.e., by
examining the primary structure and manually aligning
A-U and C-G pairs, and the free energies of such
structures can be manually calculated using, for
example, the Tinoco techniques, supra.

Examples of parameters which can be considered in
selecting most probable secondary structures include
percent base match, probability of finding as good a
match in a random sequence of bases of the same length,
and secondary structure free energy. With regard to
free energy, it is important not to dismiss secondary
structures having calculated free energies more
positive than -10.0 kcal/mole, as prior workers have
done (see above), since as shown below, such weak
secondary structures can significantly affect protein
production.
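
Of the parameters listed above, percent base match is the simplest to
illustrate. The following Python sketch assumes that a candidate stem is
given as two arms of equal length, both written 5' to 3', and that only
A-U and G-C pairs count as matches, as in the manual alignment described
above. The function name and the example arms are hypothetical, and this
is not the scoring used by any particular program.

```python
# Watson-Crick pairs only, as assumed in the lead-in above.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_match(five_prime_arm: str, three_prime_arm: str) -> float:
    """Percentage of positions forming A-U or G-C pairs when the two
    arms of a stem are aligned antiparallel."""
    if not five_prime_arm or len(five_prime_arm) != len(three_prime_arm):
        raise ValueError("arms must be non-empty and of equal length")
    # Reverse the 3' arm so that position i of one arm faces its partner.
    facing = zip(five_prime_arm, reversed(three_prime_arm))
    matches = sum((a, b) in PAIRS for a, b in facing)
    return 100.0 * matches / len(five_prime_arm)


print(percent_match("GGAGG", "CCUCC"))  # fully paired toy stem
print(percent_match("GGAGG", "CCACC"))  # one A-A mismatch in the middle
```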

A suitable computer program for determining mRNA
secondary structures is the SEQ - DNA Sequence Analysis
System marketed by IntelliGenetics, Inc., of Palo Alto,
California (hereinafter the "SEQ program"). A
description of this program can be found in Brutlag et
al., "SEQ: A Nucleotide Sequence Analysis and
Recombination System," Nucleic Acids Research,
10:279-294 (1982). As discussed therein, the SEQ
program uses the Tinoco et al. techniques, supra, to
calculate free energies. See also Zuker and Sankoff,
Bull. of Math. Biol., 46:591-621 (1984).

Once the predicted secondary structure or
structures have been constructed, they are analyzed to
determine: 1) if all or part of either or both of the
AUG initiation codon and the Shine-Dalgarno sequence is
contained in a double stranded portion of a stem-loop
region of the secondary structure or structures; and 2)
if the free energy of such a stem-loop region is
between zero and about -10.0 kcal/mole. The free
energy can be conveniently calculated using a computer
program, such as the SEQ program.
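
The two conditions above amount to a simple decision rule, sketched
below in Python. The function name is hypothetical, and treating the
bounds as inclusive with a sharp cut-off at -10.0 kcal/mole is an
assumption, since the text says "about -10.0 kcal/mole".

```python
def needs_modification(
    sd_paired: bool,
    aug_paired: bool,
    delta_g: float,
    lower_bound: float = -10.0,
) -> bool:
    """Return True when both conditions hold: (1) all or part of the
    Shine-Dalgarno sequence or the AUG codon is double stranded, and
    (2) the stem-loop free energy (kcal/mole) lies between zero and
    the lower bound."""
    sequestered = sd_paired or aug_paired
    weakly_stable = lower_bound <= delta_g <= 0.0
    return sequestered and weakly_stable


# A stem-loop of -3.9 kcal/mole that sequesters the Shine-Dalgarno sequence
print(needs_modification(sd_paired=True, aug_paired=False, delta_g=-3.9))
```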

If both of the foregoing conditions are satisfied,
the mRNA sequence is then analyzed to identify
modifications to the mRNA sequence, i.e., base
additions, deletions, and/or substitutions, which will
result in predicted secondary structures wherein the
AUG initiation codon and the Shine-Dalgarno sequence
are not contained in a double stranded portion of the
predicted secondary structure. Alternatively, but less
preferably, the sequence is analyzed for modifications
which will place the AUG initiation codon and/or the
Shine-Dalgarno sequence in a double stranded portion of
a predicted secondary structure which is even less stable
than the predicted secondary structure of the original
mRNA sequence, i.e., in a secondary structure whose
calculated free energy is more positive than the
calculated free energy of the original mRNA sequence.

Various constraints must be kept in mind in
considering possible modifications to the mRNA
sequence. For example, if the protein which is to be
produced is to remain unchanged, modifications which
are to be made to the portion of the mRNA sequence
which codes for the protein are limited to those which
degenerately code for the same amino acids. See, for
example, Nussinov, R., "RNA Folding Is Unaffected by
the Nonrandom Degenerate Codon Choice," Biochimica et
Biophysica Acta, 698:111-115 (1982). Also, when
modifying the protein coding region, it is preferred to
select codons which correspond to those tRNAs which are
most abundant in the host cell which will be used for
protein production.
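
The constraint of using only degenerate (synonymous) codons can be
illustrated with a short Python sketch. It assumes the standard genetic
code, written in the conventional TCAG ordering. The function name is
hypothetical, and the sketch does not model host tRNA abundance, which
would require host-specific data not given here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """Return the other codons (as DNA) that code for the same amino
    acid. RNA input is accepted; U is treated as T."""
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_TABLE[codon]
    return sorted(
        c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon
    )


print(synonymous_codons("GAA"))  # glutamate has one alternative codon
print(synonymous_codons("AUG"))  # methionine has none
```

Codons with no synonym, such as AUG, cannot be altered without changing
the protein, so any modification there would have to be made in the
surrounding sequence instead.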
When modifying non-coding regions of the mRNA
sequence, other constraints come into play. For
example, it is in general preferred to avoid changes in
the Shine-Dalgarno sequence and the spacing between the
AUG initiation codon and the Shine-Dalgarno sequence
that may adversely affect the translation initiation
process. Also, as reported by De Boer et al., the
addition of C or G residues to the region between the
initiation codon and the Shine-Dalgarno sequence is not
preferred. Of course, although changes of the
foregoing types are not preferred, they can be used if
necessary to achieve the desired predicted secondary
structure.

Once a modified mRNA sequence has been selected,
its predicted secondary structure is determined
following the same procedures as those used for the
original mRNA sequence. Again, the structure, or in
some cases, structures, are examined to determine the
locations of the AUG initiation codon and the
Shine-Dalgarno sequence, and, if necessary, the free
energy of the region of the secondary structure which
contains these elements is calculated. If necessary,
different or further modifications of the original mRNA
sequence are then analyzed until a modified mRNA
sequence is selected which achieves the goal of
minimizing the likelihood that secondary structure will
interfere with the functioning of the Shine-Dalgarno
sequence and the AUG initiation codon.
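
The iterative evaluation of candidate modifications can be sketched in
Python as follows. This is only an illustration under stated
assumptions: the predict callable stands in for whatever structure
prediction is used (a computer program or manual analysis), and it is
assumed to return whether the Shine-Dalgarno sequence is paired, whether
the AUG codon is paired, and the free energy of the relevant stem-loop
region in kcal/mole. All names and the numbers in the usage example are
hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple

# (sd_paired, aug_paired, delta_g in kcal/mole) for one predicted structure
Prediction = Tuple[bool, bool, float]


def select_modified_sequence(
    original: str,
    candidates: Iterable[str],
    predict: Callable[[str], Prediction],
) -> Optional[str]:
    """Return the first candidate whose AUG and Shine-Dalgarno sequence
    are both single stranded. Failing that, return the candidate with
    the most positive free energy, provided it is more positive than
    that of the original. Return None if no candidate qualifies."""
    _, _, original_dg = predict(original)
    best_fallback: Optional[str] = None
    best_fallback_dg = original_dg
    for candidate in candidates:
        sd_paired, aug_paired, dg = predict(candidate)
        if not sd_paired and not aug_paired:
            return candidate  # preferred outcome
        if dg > best_fallback_dg:
            # Less preferred: still paired, but in a weaker structure.
            best_fallback, best_fallback_dg = candidate, dg
    return best_fallback


# Usage with a stand-in predictor backed by made-up results.
fake_results = {
    "original": (True, False, -3.9),
    "variant_1": (True, False, -5.0),
    "variant_2": (True, False, -1.7),
    "variant_3": (False, False, -3.2),
}
chosen = select_modified_sequence(
    "original", ["variant_1", "variant_2", "variant_3"], fake_results.__getitem__
)
print(chosen)
```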
Production of the modified mRNA sequence is
achieved by altering the DNA or RNA sequence which
codes for the original mRNA. Various techniques can be
used to produce the modified DNA or RNA sequence. For
example, site-specific mutagenesis can be used to
achieve the modifications. See, for example, Messing,
J., "New M13 Vectors for Cloning," Methods in
Enzymology: Recombinant DNA, Wu et al. (eds.),
Academic Press, New York, Volume 101, Part C, pages
20-79 (1983). Similarly, the modifications can be
produced through the use of synthetic gene fragments
and nucleases which fragment the DNA or RNA sequence at
predetermined locations. As will be evident to persons
of ordinary skill in the art, other techniques which
may be developed in the future for preparing or
altering DNA or RNA sequences can be used in the
practice of the present invention.

Production of protein using the modified DNA or
RNA sequence is achieved following standard recombinant
DNA/RNA techniques. A discussion of these techniques
can be found in, for example, Molecular Cloning: A
Laboratory Manual, by T. Maniatis et al., Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York (1982).
Again, as with the altering of DNA or RNA sequences,
the present invention can be used in combination with
techniques developed in the future for producing
proteins from genetically altered cells.

Without intending to limit it in any manner, the
invention will be more fully described by the following
examples. The materials and methods which are common
to the examples are as follows.

MATERIALS AND METHODS 
Strains, plasmids, and plasmid isolation

E. coli K-12 strain JA221 (hsdM+ hsdR- recA Leu
LacY Trp) was used as the transformation host. pIN
series vectors, in particular, pIN-I-A2 and pIN-I-A3,
were used as the cloning vehicles. See Nakamura, K.,
and Inouye, M., "Construction of versatile expression
cloning vehicles using the lipoprotein gene of
Escherichia coli," The EMBO Journal, 1:771-775 (1982).
These vectors use the promoter and the 5' untranslated
region of the E. coli outer membrane lipoprotein gene
for transcriptional and translational initiation of the
cloned gene.

Plasmid pCGS282, which was obtained from
Collaborative Research, Inc., Lexington, Massachusetts,
was used as a source of a leukocyte interferon α1 gene
(see Figure 1). This plasmid is a hybrid plasmid in
which a mature human interferon α1 gene has been
inserted between the S. cerevisiae galactose promoter
(GAL^P) and the S. cerevisiae invertase transcription
terminator (SUC^T). In this construction, 69 base pairs
coding for the signal sequence of the prointerferon
protein have been removed and replaced by an ATG
translation initiation codon and ClaI and HindIII
linkers.
Plasmids were isolated by the method described by
Tanaka and Weisblum. Tanaka, T. and Weisblum, B.,
"Construction of a colicin E1-R factor composite
plasmid in vitro: Means of amplification of
deoxyribonucleic acid," J. Bacteriol., 121:345-362
(1975).

Preparation of cell extracts, gel electrophoresis and
immunostaining

Cell pellets from 20 ml of log phase E. coli cell
culture were suspended in 2 ml of 0.1 M phosphate
buffer (pH 7.5). Cells were broken by sonication for 5
min. Cell extracts were obtained after 1 hour
ultracentrifugation at 40,000 rpm at 4°C.

Sodium dodecylsulfate-polyacrylamide gel
electrophoresis (SDS-PAGE) was performed using a 15%
slab gel by the method of Anderson et al. Anderson, C.
W., Baum, P. R. and Gesteland, R. F., "Processing of
adenovirus 2-induced proteins," J. Virol., 12 (1973)
241-252.

A mixture of 20 μl of cell extract and 20 μl of
loading buffer was loaded on the 15% SDS-PAGE and
the samples were electrophoresed at 140 V until the
bromophenol blue dye moved out of the gel. The protein
bands on the gel were transferred to a nitrocellulose
sheet by electroblotting at 70 V for 3 hours at 4°C.
The sheet was washed with TBS (20 mM Tris-HCl, pH 7.5,
500 mM NaCl) and blocked with TBS containing 3% BSA at
room temperature for 30 min. The blocked sheet was
incubated overnight in T-TBS (TBS + 0.05% Tween 20) at
4°C containing a 100-fold dilution of rabbit polyclonal
antibody against alpha interferon which was obtained
from Interferon Sciences, Inc., New Brunswick, New
Jersey. The sheet was washed three times with T-TBS,
incubated in a 3000-fold dilution of BioRad peroxidase-
conjugated goat-anti-rabbit IgG at room temperature for
2 hours, and washed three times with TBS. The protein
bands were visualized by developing the sheet in a
freshly prepared solution of 0.05% 4-chloro-1-
naphthol / 0.00015% H2O2 at room temperature for 30 min.
The developed sheet was washed four to five times with
distilled water to stop the reaction and then air
dried.

Example 1

Construction of Plasmid pNL015

This example relates to the construction of a
plasmid (pNL015) which produces a mRNA sequence in
which the Shine-Dalgarno sequence is contained in a
double stranded portion of a stem-loop region of the
sequence's predicted secondary structure. The
calculated free energy of the stem-loop region is -3.9
kcal/mole, i.e., the calculated free energy is in the
range of free energies which prior workers in the art
thought could not significantly affect protein
production.

pNL015 was constructed by first constructing
pNL014 by ligating a HindIII-SalI DNA fragment
containing the IFN α1 gene from pCGS282 to the large
HindIII-SalI fragment of pIN-I-A3 (see Figure 1). In
this construction, the promoter, the 5' untranslated
region of the lipoprotein gene, a sequence coding for
the first two amino acid residues of the
prolipoprotein, and a linker sequence coding for seven
amino acid residues are situated 5' to the coding
region of the methionyl IFN α1 gene. At the 3' end,
the IFN α1 3' untranslated region is followed by an
invertase transcription terminator. Accordingly, there
are two transcriptional termination sequences, both
eukaryotic in nature, following the α1 interferon
coding sequence.

The biological activity of interferon isolated
from E. coli cells JA221 harboring pNL014 was measured
using Vesicular Stomatitis virus (Indiana Strain) on
HEp-2 cells in a cytopathic effect assay. See Lee, N.,
Cozzitorto, J., Wainwright, N. and Testa, D., "Cloning
with tandem gene systems for high level gene
expression," Nucleic Acids Res., 12 (1984) 6797-6812.
The quantity of IFN isolated from these cells was in
the range of 1.2 x 10* units/ml/OD. 

pNL015 was constructed from pNL014 and pIN-I-A3
following the procedures shown in Figure 2. In this
plasmid, the two eukaryotic terminators of plasmid
pNL014 are replaced by a single E. coli lipoprotein
terminator. This change was found to result in a
two-fold increase in interferon activity (i.e., to the
range of 2.8 x 10^ units/ml/OD).

Example 2

Construction of a Predicted Secondary
Structure for mRNA Produced from pNL015

A predicted secondary structure for mRNA produced
from pNL015 was constructed using the IntelliGenetics
SEQ program, supra. The parameter settings employed in
performing the computer analysis of the primary mRNA
sequence were the default parameters for the SEQ
program (i.e., PercentMatch = 70%, AfterMismatch = 2,
LoopOut = 3, MinLoop = 3, and MaxLoop = 50). Various
combinations of the PercentMatch, MaxLoop, MinLoop, and
AfterMismatch parameters used in the SEQ program were
found to predict the same secondary structure as the
default parameters.

The predicted secondary structure in the region of
the AUG initiation codon and the Shine-Dalgarno
sequence (AGAGGGU) obtained by this analysis is shown
in Figure 4. The secondary structure shown has a free
energy of -3.9 kcal/mole, a percentage match of
and an "E" factor, i.e., a probability of finding as
good a match in a random sequence of bases of the same
length, of 2.387.
25: . . As. is - evident from Figure 4 , the Shine-Dalgamo 

. sequence is contained in a double stranded portion of a 
stem-lopp region of the predicted secondary structure." 
Moreover , as indicated above ^ the stem- loop region has 
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a calculated free energy in the range of 0 to -10.0 
kcal/mole, i.e., a calculated free energy of -3.9 
kcal/mole. 
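
The two tests applied in this example (is a ribosome binding element inside a double stranded stem, and does the stem-loop free energy fall between 0 and about -10.0 kcal/mole) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `StemLoop` container, the function names, and the base positions in the usage lines are hypothetical and are not taken from Figure 4 or from the SEQ program.

```python
from dataclasses import dataclass


@dataclass
class StemLoop:
    """Hypothetical container for one predicted stem-loop region."""
    paired: frozenset      # 0-based indices of bases in the double stranded portion
    free_energy: float     # calculated free energy, kcal/mole


def overlaps_stem(stem, start, length):
    """True if any base of the site [start, start + length) is base-paired."""
    return any(i in stem.paired for i in range(start, start + length))


def is_candidate(stem, sd_start, sd_len, aug_start, floor=-10.0):
    """Apply both tests: site occlusion, and free energy between 0 and floor."""
    occluded = (overlaps_stem(stem, sd_start, sd_len)
                or overlaps_stem(stem, aug_start, 3))
    return occluded and floor <= stem.free_energy <= 0.0


# Illustrative positions only: a 7-base SD site lying inside a stem of -3.9 kcal/mole.
stem = StemLoop(frozenset(range(5, 12)), -3.9)
print(is_candidate(stem, sd_start=5, sd_len=7, aug_start=20))
```

With a free energy of -3.9 kcal/mole and the Shine-Dalgarno site inside the stem, the sketch flags the sequence as one to which the modification step applies.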

Example 3

Construction of Plasmid pNL008

This example illustrates the modification of
pNL015 to produce a plasmid (pNL008) which produces a
mRNA having a predicted secondary structure in which
neither the AUG initiation codon nor the Shine-Dalgarno
sequence is included in a double stranded portion of a
stem-loop region of the secondary structure.

To construct pNL008, pNL015 was linearized with
Cla I, blunt-ended with S1 nuclease, ligated with EcoR I
linkers and digested with EcoR I. The small EcoR I
fragment was isolated and inserted into the EcoR I site
of the pIN-I-A2 cloning vehicle as shown in Figure 3.

pNL008 differs from pNL015 in that the mRNA
transcript produced by pNL008 has 11 bases deleted
starting from base 17 downstream of the AUG initiation
codon, and a 2 base (A-A) insertion between bases 9 and
10. Other than these differences, the two plasmids are
identical.
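
The two sequence changes can be expressed as a small Python helper. This is a sketch under an assumed numbering convention: base 1 is taken to be the first base after the AUG codon, which the text does not state explicitly. The function name and the placeholder input are hypothetical; the actual pNL015 sequence is not reproduced here.

```python
def apply_edits(downstream, del_start=17, del_len=11, ins_after=9, insert="AA"):
    """Delete del_len bases from 1-based position del_start, then insert
    `insert` between bases ins_after and ins_after + 1.

    `downstream` holds the bases following the AUG codon; base 1 is assumed
    to be the first base after AUG (an assumption, not stated in the text).
    """
    if ins_after >= del_start:
        raise ValueError("sketch assumes the insertion lies upstream of the deletion")
    # Deletion first; positions upstream of it keep their original numbering.
    deleted = downstream[:del_start - 1] + downstream[del_start - 1 + del_len:]
    return deleted[:ins_after] + insert + deleted[ins_after:]


# Placeholder string so each position is easy to follow (not a real sequence).
placeholder = "abcdefghijklmnopqrstuvwxyz0123"
edited = apply_edits(placeholder)
print(edited, len(placeholder) - len(edited))
```

The net change is 11 - 2 = 9 bases, i.e. three codons, so the reading frame downstream of the edits is unchanged.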

Figure 4 shows a predicted secondary structure for
pNL008 where the deleted and inserted sequences in
pNL015 and pNL008 are indicated by the light bars. The
structure for pNL008 shown in Figure 4 was constructed
manually from the primary structure for the plasmid's
first 74 bases. Using the Tinoco technique, the free
energy for this structure was calculated manually and
found to be -a.2 kcal/mole. Computer analysis using
the SEQ program of the same primary sequence predicted
a less stable structure having an energy of -1.7
kcal/mole in which the SD sequence was partially
contained in a double stranded portion of a stem-loop
region.

As is evident from Figure 4, the modifications to
pNL015 have freed both the initiation codon and the
Shine-Dalgarno sequence from double stranded regions of
the predicted secondary structure. Significantly, this
freeing was found to result in a ten-fold increase in
the production of α interferon. Specifically, E. coli
cells transformed with pNL008 were found to produce 3 x
10^ units/ml/OD as compared to only 2.8 x 10^
units/ml/OD for cells transformed with pNL015.

Example 4

Experiments to Confirm that the Difference in Protein
Production Between Plasmids pNL008 and pNL015 Is Due
To the Predicted Secondary Structures of Their mRNAs

A series of experiments were performed to confirm
that the observed difference in protein production
between plasmids pNL008 and pNL015 was due to their
predicted secondary structures as opposed to other
factors.

First, it was determined that the specific
activities of interferon encoded by pNL015 and pNL008
were the same. In addition, it was determined that
cells containing pNL015 had a proportionally lower
quantity of protein detectable by Western blot analysis
than cells containing pNL008. Accordingly, it was
concluded that the difference in N-terminus amino acid
residues was not a factor affecting the antiviral
activity of the interferon protein.

The in vivo stability of the interferon protein
was examined in a pulse-chase experiment. The
interferon proteins encoded by pNL015 and pNL008 showed
no noticeable degradation within 60 minutes (see Figure
5). It was therefore concluded that the differential
rate of degradation by proteolytic enzymes was not a
contributing factor to the observed difference in IFN
expression.

Experiments were carried out to quantitate the
interferon mRNA synthesized. Dot blot hybridization
experiments showed that pNL008 produced slightly more
interferon mRNA than pNL015 (see Figure 6). This
minimal change in the efficiency of transcription,
however, was not large enough to account for the
observed ten fold increase in expression level.

The stability of the IFN mRNAs was studied by
measuring levels of labeled IFN at various time
intervals after the addition of rifampicin, an
inhibitor of RNA synthesis, to cultures of
JA221/pNL015 and JA221/pNL008. As can be seen in
Figure 7, functional half-lives of the transcripts
produced by both pNL015 and pNL008 are approximately 4
to 5 minutes.
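
A half-life of this kind can be estimated from a time course by fitting a first-order decay. The Python sketch below assumes simple exponential decay after transcription is blocked; the source does not state how the half-lives in Figure 7 were derived, and the numbers in the usage lines are synthetic values generated for illustration, not measurements.

```python
import math


def half_life(times, signals):
    """Least-squares fit of ln(signal) = ln(s0) - k * t; returns ln(2) / k.

    Assumes first-order decay (an assumption, not stated in the source).
    """
    ys = [math.log(s) for s in signals]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
             / sum((t - t_mean) ** 2 for t in times))
    return math.log(2) / -slope


# Synthetic time course (minutes, arbitrary signal units) built from a
# known 4.5 minute half-life purely to exercise the function.
times = [0, 2, 4, 8]
signals = [100 * 0.5 ** (t / 4.5) for t in times]
print(round(half_life(times, signals), 3))
```

Two transcripts with the same fitted half-life, as reported here for pNL015 and pNL008, would show parallel lines on a plot of ln(signal) against time.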

Based on the foregoing experiments, it was
concluded that none of specific activity, rate of
transcription, protein degradation or mRNA degradation
could account for the observed difference in expression
between pNL008 and pNL015.

In addition to the foregoing, two additional
plasmids (pNL016 and pNL017) were constructed to
further confirm that the source of the difference in
expression between plasmids pNL008 and pNL015 was mRNA
secondary structure. These plasmids were formed by
synthesizing two EcoR I - Cla I DNA fragments (fragments
A and B) to replace the corresponding sequence in
pNL015 (bases 50 to 63 in Figure 4). DNA fragment B
contained a single-base substitution whereby base 60 of
the pNL015 transcript was changed from U to C. DNA
fragment A contained an additional base substitution
whereby both base 60 (U) and base 62 (A) were changed
to C.

These changes do not change the amino acid
sequence defined by plasmid pNL015 since each of CUA,
CUC, and UUA code for leucine. Moreover, using the SEQ
program, it was found that the predicted secondary
structures for the mRNA sequences corresponding to
pNL016 and pNL017, like the predicted secondary
structure corresponding to pNL015, had the
Shine-Dalgarno sequence in a double stranded portion of
a stem-loop region of the secondary structure.
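
That both substitutions are silent can be checked directly against the standard genetic code. The Python sketch below assumes that bases 60 to 62 of the pNL015 transcript form the UUA codon, which follows from the base identities given above (base 60 is U, base 62 is A) and the codon comparison later in this example; the helper name is hypothetical.

```python
# The six leucine codons of the standard genetic code.
LEUCINE_CODONS = {"UUA", "UUG", "CUU", "CUC", "CUA", "CUG"}


def substitute(codon, changes):
    """Return the codon with {0-based position: new base} changes applied."""
    bases = list(codon)
    for pos, base in changes.items():
        bases[pos] = base
    return "".join(bases)


pnl015 = "UUA"                                   # bases 60-62 (assumed codon)
pnl017 = substitute(pnl015, {0: "C"})            # fragment B: base 60 U -> C
pnl016 = substitute(pnl015, {0: "C", 2: "C"})    # fragment A: bases 60 and 62 -> C

for name, codon in [("pNL015", pnl015), ("pNL017", pnl017), ("pNL016", pnl016)]:
    print(name, codon, codon in LEUCINE_CODONS)
```

All three codons (UUA, CUA, CUC) fall in the leucine set, so the encoded protein is identical across the three plasmids.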

The calculated free energy values for the mRNA
stem-loop regions for the three plasmids, however, were
different. Specifically, whereas the stem-loop region
for pNL015 had a free energy of -3.9 kcal/mole, for
pNL017, the free energy was -10.8 kcal/mole, and for
pNL016, it was -17 kcal/mole. Significantly, E. coli
cells transformed with either pNL016 or pNL017 produced
no detectable interferon activity.

Since all three plasmids code for the same amino
acid sequence, post-translational factors such as
specific activity and protein stability cannot
contribute to the change in interferon titer.
Moreover, substitution of one or two bases in the
coding region is unlikely to alter the overall rate of
transcription. Having ruled out these factors, the
only possibility left is translational inhibition.
Translational efficiency is unlikely to be affected by
codon usage, since only one synonymous codon
substitution is involved. Moreover, in E. coli, the
codon CUC used in pNL016 is a more frequently used
codon and the codon CUA used in pNL017 is a less
frequently used codon in comparison to the UUA codon
used in pNL015. See Maruyama et al., "Codon usage
tabulated from the GenBank genetic sequence data,"
Nucl. Acids Res., Vol. 14 sup., pages R151-R197 (1986).
Yet, both synonymous substitutions resulted in complete
inhibition of expression.

These additional results for plasmids pNL016 and
pNL017 add further support to the conclusion that the
efficiency with which pNL015's mRNA is translated is a
function of its secondary structure notwithstanding the
relative instability of that secondary structure.

What is claimed is:

1. A method for increasing the translation
efficiency of a mRNA sequence which (i) codes for a
protein, (ii) is produced from a RNA or DNA sequence,
and (iii) includes an AUG initiation codon and a
Shine-Dalgarno sequence, comprising the steps of:

(a) determining and analyzing a predicted
secondary structure for the mRNA sequence to determine
if one or both of the AUG initiation codon and the
Shine-Dalgarno sequence is contained in a double
stranded portion of a stem-loop region of the predicted
structure;

(b) calculating a free energy value for the
stem-loop region;

(c) if the calculated free energy value is in the
range of from zero to about -10.0 kcal/mole:
(i) selecting a modified mRNA sequence;
(ii) determining and analyzing a predicted
secondary structure for the modified mRNA
sequence to determine if either or both of the
AUG initiation codon and the Shine-Dalgarno
sequence is contained in a double stranded
portion of a stem-loop region of the predicted
structure;

(iii) repeating steps (c)(i) and (c)(ii), if
necessary, until a modified mRNA sequence is
identified in which neither the AUG initiation
codon nor the Shine-Dalgarno sequence is






contained in a double stranded portion of a
stem-loop region of the predicted secondary
structure; and
(d) modifying the DNA or RNA sequence so that it
produces the modified mRNA sequence.

2. The method of Claim 1 wherein the free energy
calculated in step (b) is in the range of from zero to
about -7.0 kcal/mole.

3. The method of Claim 1 wherein the modification
of the DNA or RNA sequence results in the production of
at least five times more protein than produced by the
unmodified DNA or RNA sequence.

4. A DNA or RNA sequence which has been modified
by the method of Claim 1.

5. A transformed host cell which includes the
modified DNA or RNA sequence of Claim 4.

6. A method for increasing the translation
efficiency of a mRNA sequence which (i) codes for a
protein, (ii) is produced from a DNA or RNA sequence,
and (iii) includes an AUG initiation codon and a
Shine-Dalgarno sequence, comprising the steps of:

(a) determining and analyzing a predicted
secondary structure for the mRNA sequence to determine
if one or both of the AUG initiation codon and the
Shine-Dalgarno sequence is contained in a double
stranded portion of a stem-loop region of the predicted
structure;

(b) calculating a free energy value for the
stem-loop region;

(c) if the calculated free energy value is in the
range of from zero to about -10.0 kcal/mole:

(i) selecting a modified mRNA sequence;
(ii) determining and analyzing a predicted
secondary structure for the modified mRNA
sequence to determine if either or both of the
AUG initiation codon and the Shine-Dalgarno
sequence is contained in a double stranded
portion of a stem-loop region of the predicted
structure;

(iii) if neither the AUG initiation codon nor the
Shine-Dalgarno sequence is contained in a
double stranded portion of a stem-loop region
of the predicted secondary structure for the
modified mRNA, modifying the DNA or RNA
sequence so that it produces the modified mRNA
sequence;

(iv) if either or both of the AUG initiation codon
and the Shine-Dalgarno sequence is contained
in a double stranded portion of a stem-loop
region of the predicted structure for the
modified mRNA, calculating a free energy value
for the stem-loop region;
(v) if the free energy value calculated in step
(c)(iv) is more positive than the free energy
value calculated in step (b), modifying the
DNA or RNA sequence so that it produces the
modified mRNA sequence;
(vi) if the free energy value calculated in step
(c)(iv) is equal to or more negative than the
energy value calculated in step (b),
repeating steps (c)(i) and (c)(ii), and, as
appropriate, steps (c)(iii), (c)(iv), and
(c)(v) until the condition of (c)(iii) or the
condition of (c)(v) is satisfied.
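
The decision procedure recited in steps (a) through (c)(vi) can be sketched in Python as a loop over candidate sequences. This is an illustrative reading, not a definitive implementation: `predict` stands for any secondary structure tool supplied by the user and is assumed to return a pair (occluded, free_energy) for a sequence, and `candidates` is an assumed iterable of modified mRNA sequences. Neither name comes from the source.

```python
def select_modified(mrna, predict, candidates, floor=-10.0):
    """Return the first acceptable modified sequence, or None.

    predict(seq) -> (occluded, free_energy), where occluded is True when the
    AUG codon or the Shine-Dalgarno sequence lies in a double stranded stem.
    Both predict and candidates are hypothetical inputs for this sketch.
    """
    occluded, e0 = predict(mrna)                 # steps (a) and (b)
    if not (occluded and floor <= e0 <= 0.0):    # precondition of step (c)
        return None
    for cand in candidates:                      # step (c)(i)
        cand_occluded, e1 = predict(cand)        # steps (c)(ii) and (c)(iv)
        if not cand_occluded:                    # step (c)(iii): sites are free
            return cand
        if e1 > e0:                              # step (c)(v): more positive energy
            return cand
        # step (c)(vi): equal or more negative, so try the next candidate
    return None


# Toy predictor backed by a lookup table of made-up values.
table = {"orig": (True, -3.9), "m1": (True, -5.0), "m2": (True, -1.7)}
print(select_modified("orig", table.get, ["m1", "m2"]))
```

In the toy run, the first candidate is rejected because its stem is more stable than the original, and the second is accepted because its free energy is more positive.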

7. The method of Claim 6 wherein the free energy
calculated in step (b) is in the range of from zero to
about -7.0 kcal/mole.

8. The method of Claim 6 wherein the modification
of the DNA or RNA sequence results in the production of
at least five times more protein than produced by the
unmodified DNA or RNA sequence.

9. A DNA or RNA sequence which has been modified
by the method of Claim 6.

10. A transformed host cell which includes the
modified DNA or RNA sequence of Claim 9.

11. A method for increasing the translation
efficiency of a mRNA sequence which (i) codes for a
protein, (ii) is produced from a DNA or RNA sequence,
and (iii) includes an AUG initiation codon and a
Shine-Dalgarno sequence, comprising the steps of:

(a) determining and analyzing a predicted
secondary structure for the mRNA sequence to determine
if either or both of the AUG initiation codon and the
Shine-Dalgarno sequence is contained in a double
stranded portion of a stem-loop region of the predicted
secondary structure;

(b) calculating a free energy value for the
stem-loop region; and

(c) modifying the DNA or RNA sequence for the
mRNA, if the calculated free energy value is in the
range of from zero to about -10.0 kcal/mole, so that
when the modified DNA or RNA sequence produces a
modified mRNA sequence which has a predicted secondary
structure wherein either:

(i) the AUG initiation codon and the
Shine-Dalgarno sequence are not included in a
double stranded portion of a stem-loop region
of the predicted secondary structure; or

(ii) if either or both of the AUG initiation codon
and the Shine-Dalgarno sequence are included
in a double stranded portion of a stem-loop
region of the predicted secondary structure,
the calculated free energy value for such
stem-loop region is more positive than the
free energy value calculated in step (b).

12. The method of Claim 11 wherein the free energy
calculated in step (b) is in the range of from zero to
about -7.0 kcal/mole.

13. The method of Claim 11 wherein the AUG
initiation codon and the Shine-Dalgarno sequence are not
included in a double stranded portion of a stem-loop
region of the predicted secondary structure of the
modified mRNA.

14. The method of Claim 11 wherein the
modification of the DNA or RNA sequence results in the
production of at least five times more protein than
produced by the unmodified DNA or RNA sequence.

15. A DNA or RNA sequence which has been modified
by the method of Claim 11.

16. A transformed host cell which includes the
modified DNA or RNA sequence of Claim 15.